Add grad clipping, loss history, config accessors, and HYBRID mode to TensorProgram#4
Draft
…g accessors Co-authored-by: danregima <10253941+danregima@users.noreply.github.com>

Copilot AI changed the title from "[WIP] Proceed with next steps" to "Add grad clipping, loss history, config accessors, and HYBRID mode to TensorProgram" on Mar 7, 2026.
This PR implements four missing or incomplete pieces of the TensorProgram learning pipeline, all natural follow-ons to the gradient computation work in the previous PR.

Changes
All changes are in TensorEquation.h/.cc:

- `_max_iterations` and `_convergence_threshold` were private with no public API; adds `set_max_iterations()`, `max_iterations()`, `set_convergence_threshold()`, `convergence_threshold()`
- `set_grad_clip(double)` / `grad_clip()`: when non-zero, each gradient vector is L2-norm-clamped in `update_parameters()` before the SGD step; the logic is extracted into a `clip_gradient_vector()` static helper to avoid duplication
- `train()` now clears and appends to `_loss_history` each epoch (pre-update, the standard ML convention), exposed via `loss_history() -> const std::vector<double>&`
- `execute()` previously treated `HYBRID` identically to `CONTINUOUS`; it now applies a sigmoid (when no explicit nonlinearity is set) and then a threshold, yielding binary outputs with gradient flow via the existing straight-through estimator in `backward()`
Tests

10 new tests:

- `max_iterations`/`convergence_threshold` accessors and their effect on `forward_to_fixpoint()`
- `grad_clip` round-trip, and that clipping actually bounds weight deltas to ≤ lr × clip
- `loss_history` length equals the epoch count, values are non-negative, and loss decreases over training
- HYBRID mode produces binary `{0,1}` output, routes through sigmoid → threshold when the nonlinearity is `NONE`, and skips the extra sigmoid when an explicit nonlinearity is set